# Zero-shot learning
**The Teacher V 2**
A zero-shot classification model based on the Transformers architecture that can classify text without fine-tuning.
Large Language Model · Transformers · by shiviktech · 196 downloads · 0 likes

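As a quick illustration of how a zero-shot classifier like this is typically used, the sketch below runs the Hugging Face transformers zero-shot-classification pipeline; the checkpoint name, example text, and candidate labels are placeholders rather than details taken from this entry.

```python
from transformers import pipeline

# Zero-shot classification sketch. "facebook/bart-large-mnli" is a widely used
# public checkpoint standing in for the listed model, which is assumed to expose
# the same zero-shot-classification interface.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

result = classifier(
    "The new GPU drastically cuts model training time.",
    candidate_labels=["hardware", "politics", "sports"],
)
print(result["labels"][0], result["scores"][0])  # best label and its score
```
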
**F0**
An automatically generated Transformers model card; detailed information has not yet been provided.
Large Language Model · Transformers · by vdmbrsv · 2,602 downloads · 1 like

**Sarvam Finetune**
A Transformers model published on the Hub; its specific functions and details have not yet been documented.
Large Language Model · Transformers · by jk12p · 112 downloads · 1 like

**Um P2 Fine Tuned Llama Full 2**
A Transformers model pushed to the Hub; its specific functions and uses have not yet been documented.
Large Language Model · Transformers · by ElijahLiew2 · 152 downloads · 1 like

**Yinglong 110m**
YingLong is a time series forecasting model pretrained on 78 billion time points, providing strong support for forecasting tasks.
Climate Model · Safetensors · by qcw2333 · 348 downloads · 0 likes

**Uzmi Gpt** (Apache-2.0)
GPT-2 is an open-source language model developed by OpenAI, based on the Transformer architecture and capable of generating coherent text.
Large Language Model · English · by rajan3208 · 30 downloads · 2 likes

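Since this entry is a GPT-2 derivative, a minimal text-generation sketch with the public gpt2 checkpoint is shown below; the Uzmi Gpt repository id is not given in this listing, so the upstream checkpoint is used instead.

```python
from transformers import pipeline

# Text-generation sketch with the upstream "gpt2" checkpoint; swap in the
# Uzmi Gpt repository id only if it follows the standard causal-LM layout (assumed).
generator = pipeline("text-generation", model="gpt2")

out = generator(
    "Zero-shot learning lets a model",
    max_new_tokens=40,
    num_return_sequences=1,
)
print(out[0]["generated_text"])
```
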
**Xlm Roberta Large Pooled Cap Media Minor** (MIT)
A multilingual text classification model fine-tuned from xlm-roberta-large, supporting English and Danish and focusing on political agenda and media content classification tasks.
Text Classification · PyTorch · Other · by poltextlab · 163 downloads · 0 likes

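A hedged sketch of running such a sequence-classification fine-tune follows; the repository id is inferred from the display name and may differ, and the example sentence is a placeholder.

```python
from transformers import pipeline

# Text-classification sketch. The repository id below is inferred from the
# listing's display name (an assumption); any xlm-roberta-large
# sequence-classification fine-tune is used the same way.
clf = pipeline(
    "text-classification",
    model="poltextlab/xlm-roberta-large-pooled-cap-media-minor",
)

print(clf("The government announced a new healthcare spending bill."))
```
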
**CLIP ViT L Rho50 K1 Constrained FARE2** (MIT)
A feature extraction model fine-tuned from openai/clip-vit-large-patch14, with optimized image and text encoders.
Multimodal Fusion · Transformers · by LEAF-CLIP · 253 downloads · 0 likes

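Because this model is a fine-tune of openai/clip-vit-large-patch14, the feature-extraction sketch below uses that stated base checkpoint; loading the FARE2 weights the same way is an assumption that holds only if the repository ships a standard CLIPModel configuration.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# CLIP feature-extraction sketch with the stated base checkpoint;
# "example.jpg" is a placeholder path.
model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

image = Image.open("example.jpg").convert("RGB")
inputs = processor(
    text=["a photo of a cat", "a photo of a dog"],
    images=image,
    return_tensors="pt",
    padding=True,
)

with torch.no_grad():
    outputs = model(**inputs)

print(outputs.image_embeds.shape)  # pooled, projected image features
print(outputs.text_embeds.shape)   # pooled, projected text features
print(outputs.logits_per_image)    # image-text similarity scores
```
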
**Style 250412.vit Base Patch16 Siglip 384.v2 Webli**
A vision model based on the Vision Transformer architecture, trained using SigLIP (Sigmoid Loss for Language-Image Pretraining), suitable for image understanding tasks.
Image Classification · Transformers · by p1atdev · 66 downloads · 0 likes

**Llama 3.1 8B AthenaSky MegaMix** (Apache-2.0)
An 8B-parameter large language model fused via MergeKit from multiple high-quality models, optimized for reasoning, dialogue, and creative generation.
Large Language Model · Transformers · English · by ZeroXClem · 105 downloads · 2 likes

**Ibm Granite.granite Vision 3.2 2b GGUF**
Granite Vision 3.2 2B is a vision-language model developed by IBM, focusing on image-to-text tasks.
Image-to-Text · by DevQuasar · 211 downloads · 1 like

**Yoloe**
YOLOE is an efficient, unified, and open model for object detection and segmentation, supporting various prompting mechanisms, including text, visual inputs, and prompt-free paradigms, achieving real-time universal visual perception.
Object Detection · by jameslahm · 40.34k downloads · 32 likes

**Bytedance Research.ui TARS 72B SFT GGUF**
A 72B-parameter multimodal foundation model released by ByteDance Research, specializing in image-text-to-text tasks.
Image-to-Text · by DevQuasar · 81 downloads · 1 like

**Whisper Large V3 Turbo** (MIT)
Whisper large-v3-turbo is an automatic speech recognition and speech translation model developed by OpenAI, trained with large-scale weak supervision and supporting multiple languages.
Speech Recognition · Transformers · Supports Multiple Languages · by Daemontatox · 26 downloads · 1 like

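The sketch below shows the usual way to transcribe audio with this family of models via the transformers ASR pipeline; the canonical openai/whisper-large-v3-turbo id is used on the assumption that this entry is a re-upload of it, and the audio path is a placeholder.

```python
from transformers import pipeline

# Automatic speech recognition sketch; "speech_sample.wav" is a placeholder
# path to any local audio file.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3-turbo")

result = asr("speech_sample.wav", return_timestamps=True)  # timestamps enable >30 s audio
print(result["text"])
```
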
**Lamarckvergence 14B** (Apache-2.0)
Lamarckvergence-14B is a pre-trained language model merged via mergekit, combining Lamarck-14B-v0.7 and Qwenvergence-14B-v12-Prose-DS. It ranks first among models with fewer than 15B parameters on the Open LLM Leaderboard.
Large Language Model · Transformers · English · by suayptalha · 15.36k downloads · 24 likes

**Depthmaster** (Apache-2.0)
DepthMaster is a refined single-step diffusion model that customizes generative features from diffusion models for discriminative depth estimation tasks.
3D Vision · English · by zysong212 · 50 downloads · 9 likes

**Minimax Text 01**
A text generation model that produces coherent text from input prompts.
Text Generation · by MiniMaxAI · 8,231 downloads · 580 likes

**Resnet101 Clip Gap.openai** (Apache-2.0)
A ResNet101 image encoder from the CLIP framework that extracts image features via global average pooling (GAP).
Image Classification · Transformers · by timm · 104 downloads · 0 likes

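As this is a timm checkpoint, a feature-extraction sketch with the timm API follows; the model name is inferred from the listing title and assumed to be available in a recent timm release, and the image path is a placeholder.

```python
import timm
import torch
from PIL import Image

# Feature-extraction sketch; num_classes=0 makes the model return pooled features.
model = timm.create_model("resnet101_clip_gap.openai", pretrained=True, num_classes=0)
model.eval()

cfg = timm.data.resolve_model_data_config(model)
transform = timm.data.create_transform(**cfg, is_training=False)

img = Image.open("example.jpg").convert("RGB")
with torch.no_grad():
    features = model(transform(img).unsqueeze(0))  # globally average-pooled image features
print(features.shape)
```
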
**Ioskef 23 11 06** (MIT)
A model checkpoint for the Any-to-Any subnet collaboration between OMEGA Labs and Bittensor, which targets general artificial intelligence tasks.
Large Language Model · Other · by louistvc · 0 downloads · 0 likes

**Ioskef 23 11 05** (MIT)
An Any-to-Any subnet model jointly developed by OMEGA Labs and Bittensor, focusing on general artificial intelligence tasks.
Large Language Model · Other · by louistvc · 0 downloads · 0 likes

**Florence 2 Large Ft Safetensors** (MIT)
Florence-2 is an advanced visual foundation model developed by Microsoft, employing a prompt-based architecture to unify various vision and vision-language tasks.
Image-to-Text · by mrhendrey · 162 downloads · 2 likes

**Ttm Research R2**
A compact pre-trained model for multivariate time series forecasting, open-sourced by IBM Research, with parameter counts starting from 1 million, pioneering the concept of 'tiny' pre-trained time series forecasting models.
Climate Model · Safetensors · by ibm-research · 400 downloads · 2 likes

**Show O W Clip Vit** (MIT)
Show-o is a PyTorch-based any-to-any model focused on multimodal tasks.
Text-to-Image · by showlab · 18 downloads · 2 likes

**Show O** (MIT)
Show-o is a PyTorch-based any-to-any model that supports inputs and outputs across multiple modalities.
Text-to-Video · by showlab · 225 downloads · 16 likes

**Speecht5 Base Cs Tts**
A monolingual Czech SpeechT5 base model, pre-trained on 120,000 hours of Czech audio and a 17.5 billion-word text corpus, designed as a starting point for Czech TTS fine-tuning.
Speech Synthesis · Transformers · Other · by fav-kky · 66 downloads · 0 likes

**Opensearch Neural Sparse Encoding Doc V2 Distill** (Apache-2.0)
A sparse retrieval model based on distillation, optimized for OpenSearch, supporting inference-free document encoding with improved search relevance and efficiency over V1.
Text Embedding · Transformers · English · by opensearch-project · 1.8M downloads · 7 likes

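For context on how "inference-free" sparse document encoding is typically computed with models in this family, here is a hedged sketch: the max-pooled, log-saturated ReLU over masked-LM logits is a commonly described recipe, not a copy of the project's official snippet, and special-token handling is omitted.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Approximate sparse document encoding; see the OpenSearch model card for the
# exact, officially supported code.
model_id = "opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

doc = "OpenSearch supports sparse neural retrieval."
feat = tokenizer(doc, return_tensors="pt")

with torch.no_grad():
    logits = model(**feat).logits                         # [1, seq_len, vocab_size]

weights = torch.log1p(torch.relu(logits))                 # saturate activations
weights = weights * feat["attention_mask"].unsqueeze(-1)  # zero out padding
sparse_vec = weights.max(dim=1).values.squeeze(0)         # one weight per vocab term

top = torch.topk(sparse_vec, k=10)
for tok, w in zip(tokenizer.convert_ids_to_tokens(top.indices.tolist()), top.values.tolist()):
    print(f"{tok}\t{w:.3f}")
```
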
**Controlnet Union Sdxl 1.0** (Apache-2.0)
An all-in-one ControlNet for image generation and editing, supporting 12 control conditions and 5 advanced editing functions.
Image Generation · by xinsir · 156.68k downloads · 1,413 likes

**Bitnet B1 58 Xl Q8 0 Gguf** (MIT)
BitNet b1.58 is a large language model with 1.58-bit quantization. It reduces computational resource requirements by lowering the weight precision while maintaining performance close to that of a full-precision model.
Large Language Model · Transformers · by BoscoTheDog · 326 downloads · 7 likes

**Florence 2 Large Ft** (MIT)
Florence-2 is an advanced vision foundation model developed by Microsoft, employing a prompt-based approach to handle various vision and vision-language tasks.
Image-to-Text · Transformers · by andito · 93 downloads · 4 likes

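A captioning sketch for Florence-2 follows, based on the upstream microsoft/Florence-2-large-ft checkpoint and its custom (trust_remote_code) processing code; using this re-packaged entry instead is assumed to work the same way.

```python
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# Florence-2 captioning sketch; "example.jpg" is a placeholder path.
model_id = "microsoft/Florence-2-large-ft"
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("example.jpg").convert("RGB")
task = "<CAPTION>"  # Florence-2 selects tasks via prompt tokens such as <CAPTION>, <OD>, <OCR>

inputs = processor(text=task, images=image, return_tensors="pt")
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=128,
)
raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(processor.post_process_generation(raw, task=task, image_size=(image.width, image.height)))
```
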
**Depth Anything V2 Base Hf**
Depth Anything V2 is currently the most powerful monocular depth estimation model, trained on 595,000 synthetically annotated images and over 62 million real unlabeled images, offering finer details and stronger robustness.
3D Vision · Transformers · by depth-anything · 47.73k downloads · 1 like

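Monocular depth estimation with this checkpoint is typically a one-liner through the transformers depth-estimation pipeline; the repository id below is inferred from the listing and the image path is a placeholder.

```python
from PIL import Image
from transformers import pipeline

# Depth-estimation sketch; the model id is assumed to match this entry.
depth = pipeline("depth-estimation", model="depth-anything/Depth-Anything-V2-Base-hf")

result = depth(Image.open("example.jpg").convert("RGB"))
result["depth"].save("depth_map.png")   # PIL image of the predicted relative depth
print(result["predicted_depth"].shape)  # raw depth tensor
```
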
**Badger Lambda Llama 3 8b**
Badger is a Llama 3 8B instruction model generated through a recursive, maximally pairwise-disjoint, normalized denoising Fourier interpolation method, incorporating features from multiple high-quality models.
Large Language Model · Transformers · by maldv · 24 downloads · 11 likes

**Mobileclip S2 Timm**
MobileCLIP-S2 is an efficient image-text model trained with multi-modal reinforced training, achieving fast inference and strong zero-shot performance while maintaining a compact size.
Text-to-Image · by apple · 147 downloads · 4 likes

**Mmfreelm 2.7B** (Apache-2.0)
Large Language Model · Transformers · by ridger · 89 downloads · 35 likes

**Compare2score** (MIT)
Compare2Score is an image quality assessment model that produces a quality score for an input image.
Image Enhancement · Transformers · by q-future · 391 downloads · 4 likes

**Flamingo 2024** (MIT)
Released under the MIT license; further details about this model are not yet available.
Large Language Model · Transformers · by babylm · 6,526 downloads · 1 like

**Llama 3 8b Ita**
An Italian large language model built on Meta-Llama-3-8B, supporting text generation in English and Italian.
Large Language Model · Transformers · Supports Multiple Languages · by DeepMount00 · 16.00k downloads · 27 likes

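A chat-style generation sketch for an instruction-tuned Llama 3 derivative like this one is shown below; the repository id is inferred from the listing, the prompt is a placeholder, and running an 8B model locally assumes enough GPU memory (or quantization).

```python
import torch
from transformers import pipeline

# Chat-style text generation sketch (requires a recent transformers release
# that accepts chat messages directly in the pipeline).
pipe = pipeline(
    "text-generation",
    model="DeepMount00/Llama-3-8b-Ita",  # inferred repository id (assumption)
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [{"role": "user", "content": "Riassumi in due frasi cos'è lo zero-shot learning."}]
out = pipe(messages, max_new_tokens=120)
print(out[0]["generated_text"][-1]["content"])  # last message is the model's reply
```
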
**Nl2sql 7b** (Apache-2.0)
An open-source model released under the Apache-2.0 license; detailed information has not yet been provided.
Large Language Model · Transformers · by DMetaSoul · 47 downloads · 1 like

**Strangemerges 53 7B Model Stock** (Apache-2.0)
StrangeMerges_53-7B-model_stock is the result of merging multiple 7B-parameter models using LazyMergekit, and has strong text generation capabilities.
Large Language Model · Transformers · by Gille · 18 downloads · 1 like

**Bitnet B1 58 Large** (MIT)
BitNet b1.58 is a 1-bit large language model with 3 billion parameters, trained on 100 billion tokens from the RedPajama dataset.
Large Language Model · Transformers · by 1bitLLM · 10.17k downloads · 95 likes

**Bitnet B1 58 Xl** (MIT)
BitNet b1.58 3B is a 1-bit quantized large language model trained on 100 billion tokens from the RedPajama dataset, significantly reducing computational resource requirements while maintaining performance.
Large Language Model · Transformers · by 1bitLLM · 10.64k downloads · 34 likes
